[NNCF]: Add INT8 weight compression conformance test for Tinyllama-1.1b PyTorch model #2636
Conversation
TODO: Make the test
compress() and _compress_torch() methods were implemented
@alexsu52 Requesting review as per @MaximProshin's guideline.
Codecov Report: All modified and coverable lines are covered by tests ✅

@@            Coverage Diff             @@
##           develop    #2636       +/-   ##
============================================
- Coverage    91.21%   29.95%   -61.26%
============================================
  Files          494      494
  Lines        45775    45775
============================================
- Hits         41753    13713    -28040
- Misses        4022    32062    +28040

See 330 files with indirect coverage changes.
@alexsu52 I think I fixed the code, could you please approve the workflow?
Please check that you have pushed all the changes because I don't see the changes you mentioned in the description.
Please provide local results of the test run.
Good evening. Sorry, those initial changes are irrelevant. I changed the code a bit because the initial code was not passing the pipeline.
Will send screenshots in a bit.
Command:
pytest tests/post_training/test_quantize_conformance.py::test_weight_compression -s --data=tests/post_training/data/ -k tinyllama_int8_data_free_backend_PT
Output:
Your command does not run any test. There are
The tinyllama_int8_data_free_backend_TORCH test failed with a runtime error.
A general comment: please add a description of the test case you implemented to the PR description.
@alexsu52 Oh, I just used the command provided in the issue. I'll fix it and tag you asap. Thanks for the feedback.
TODO: Maybe make it in a way where I check for INT8 instead of BackendType.TORCH,
Co-authored-by: Aleksander <[email protected]>
@alexsu52 Hi, could you verify the logic I followed? It looks good so far. You can check the updated PR description for a heads-up.
We have a similar test pipeline for the PyTorch model in the PTQ test:
def test_ptq_quantization(
You need to reproduce the same pipeline for the weight compression test.
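The requested structure (set up a pipeline, run it, compare collected metrics against references) can be sketched as below. This is an illustrative stand-in, not the actual NNCF conformance-test code: WeightCompressionPipeline, RunInfo, and all reference numbers are hypothetical placeholders.

```python
# Sketch of a weight-compression test mirroring the PTQ conformance-test
# structure: build a pipeline, run it, check results against references.
# All class names and numeric values here are illustrative placeholders.
from dataclasses import dataclass
from typing import Optional

@dataclass
class RunInfo:
    model: str
    backend: str
    metric_value: Optional[float] = None
    num_int8: Optional[int] = None
    num_int4: Optional[int] = None
    status: str = "not_started"

class WeightCompressionPipeline:
    """Stub: a real pipeline would export, compress, and validate a model."""

    def __init__(self, model_id: str, backend: str):
        self.run_info = RunInfo(model=model_id, backend=backend)

    def run(self) -> RunInfo:
        # A real run would export an FP32 OpenVINO model, compress weights
        # to INT8, export the compressed model, then measure similarity.
        self.run_info.metric_value = 0.98  # placeholder measurement
        self.run_info.num_int8 = 312       # placeholder op count
        self.run_info.num_int4 = 0
        self.run_info.status = "completed"
        return self.run_info

# Hypothetical reference values the test would compare against.
REFERENCE = {"metric_value": 0.95, "num_int8": 312, "num_int4": 0}

def test_weight_compression():
    info = WeightCompressionPipeline(
        "tinyllama/tinyllama-1.1b-step-50k-105b", backend="TORCH"
    ).run()
    assert info.status == "completed"
    assert info.num_int8 == REFERENCE["num_int8"]
    assert info.num_int4 == REFERENCE["num_int4"]
    assert info.metric_value >= REFERENCE["metric_value"]
```

In the real suite the pipeline object would do the heavy lifting (export, compression, inference); the test body only orchestrates and asserts, which keeps the PTQ and weight-compression tests structurally parallel.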
Will do.
@alexsu52 Good morning, any updates? Do you need me to help you with anything?
Hi, I tried to run your PR and got a runtime error. I'll come back with comments after I've done some experiments.
Oh, that's weird. I'll try to run it as well.
@alexsu52 I removed the problematic line of code; I must have restored it by accidentally undoing right before the commit.
@alexsu52 Did it run?
As I understand, based on your implementation, you are trying to implement the following test flow:
- Create the FP32 PyTorch HF model
- Export the FP32 PyTorch HF model to an FP32 OpenVINO model and save it to fp32_model_dir
- Compress the FP32 PyTorch HF model to INT8
- Export the INT8 PyTorch HF model to an INT8 OpenVINO model and save it to output_model_dir
- Calculate the number of int8 and int4 operations in the INT8 OpenVINO model
- Check the number of int8 and int4 operations against the references
- Calculate the similarity metric between the FP32 OpenVINO model and the INT8 OpenVINO model. The similarity metric is calculated between OpenVINO models for inference optimization on CPU.
- Check the similarity metric against the reference
Is this a correct statement?
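The last two pairs of steps above (counting compressed operations and checking a similarity metric) can be sketched with small illustrative helpers. These are not the NNCF implementations; the op-precision list and the cosine-based metric are assumptions used only to make the idea concrete.

```python
# Illustrative helpers for the final steps of the flow above: count
# int8/int4 operations (e.g. from an OpenVINO model's weight constants)
# and compute a cosine-style similarity between FP32 and INT8 outputs.
import numpy as np

def count_low_precision_ops(op_precisions):
    """Count occurrences of int8/int4 precisions in a list of op precisions."""
    counts = {"int8": 0, "int4": 0}
    for precision in op_precisions:
        if precision in counts:
            counts[precision] += 1
    return counts

def similarity(fp32_logits: np.ndarray, int8_logits: np.ndarray) -> float:
    """Mean cosine similarity between per-token logit vectors."""
    a = fp32_logits / np.linalg.norm(fp32_logits, axis=-1, keepdims=True)
    b = int8_logits / np.linalg.norm(int8_logits, axis=-1, keepdims=True)
    return float(np.mean(np.sum(a * b, axis=-1)))

# Toy usage: two int8 constants and one int4 constant among the ops,
# and identical outputs giving a similarity of ~1.0.
counts = count_low_precision_ops(["int8", "f32", "int8", "int4"])
assert counts == {"int8": 2, "int4": 1}
x = np.random.rand(4, 8)
assert similarity(x, x) > 0.999
```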
Thanks for your update. Please pay attention to my comments.
Good day! Yup, thanks for the review.
Yes, except in Step 4: I didn't export it as
Co-authored-by: Alexander Suslov <[email protected]>
Also deleted the following class attributes: MODEL_NAME, MODEL_FUNC
Co-authored-by: Alexander Suslov <[email protected]>
Co-authored-by: Alexander Suslov <[email protected]>
Utilization of export_from_model() function from Optimum
Co-authored-by: Alexander Suslov <[email protected]>
TORCH Backends only
@alexsu52 Thanks for the detailed review. I implemented all the changes and cleaned up the remaining code; could you review it one last time?
@alexsu52 FYI, here's the output:
I have run an internal build to validate your changes. It takes some time.
LGTM. Thanks for the contribution!
build: manual/job/post_training_weight_compression/57
Thanks for the guidance, have a great day.
Changes
- Added INT8 compression test suite to the model_scope
- Added TORCH backend support in the LMWeightCompression class
- For INT8 compression, dataset, as well as some other parameters (see model_scope), are set to None
- Used save_pretrained() for TORCH models
- TORCH models (check the commits for details)

Reason for changes
Requested to benchmark changes via whowhatbench in issue #2527

Related tickets
ref: 130788
Closes #2527

Tests
INT8 weight compression conformance test for Tinyllama-1.1b PyTorch model
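The model_scope entry described in the Changes list (a data-free INT8 case with dataset and other compression parameters set to None) could look roughly like the sketch below. The field names and values are hypothetical, based only on the description above, not on the actual NNCF test-suite schema.

```python
# Hypothetical shape of a model_scope entry for the data-free INT8 case.
# Field names are illustrative; only "dataset is None for data-free INT8"
# and "TORCH backend" come from the PR description above.
from enum import Enum

class BackendType(Enum):
    TORCH = "TORCH"
    OV = "OV"

TINYLLAMA_INT8_DATA_FREE = {
    "reported_name": "tinyllama_int8_data_free",
    "model_id": "tinyllama/tinyllama-1.1b-step-50k-105b",  # assumed HF id
    "pipeline_cls": "LMWeightCompression",  # placeholder reference
    "compression_params": {
        "dataset": None,  # data-free: no calibration dataset
        "mode": "INT8",   # target weight precision
        "ratio": None,    # mixed-precision ratio not used for INT8
    },
    "backends": [BackendType.TORCH],
}
```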